Mapping Multi-Layer Bayesian LDA to Massively Parallel Supercomputers
Authors
Abstract
LDA, short for Latent Dirichlet Allocation, is a hierarchical Bayesian model for content analysis. LDA has seen a wide variety of applications, but it also presents computational challenges because iterative computation of approximate inference is required. Recently an approach based on Gibbs sampling and MPI was proposed to address these challenges; this report presents work that maps that approach to a massively parallel supercomputer, Blue Gene. The work enhances runtime performance by exploiting special hardware features of Blue Gene, such as the dual floating-point unit, and by applying general programming/compiling techniques such as loop unfolding. Results from an empirical evaluation on a real-world, large-scale data set indicate the following findings. First, the dual floating-point unit contributes a significant performance gain, and thus it should be considered in the design of processors for computationally intensive machine learning applications. Second, although loop unfolding is a simple technique supported by most compilers, it improves performance even further. Since loop unfolding is general enough to apply to other platforms, this report suggests that compilers should perform loop unfolding more intelligently.
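To illustrate the loop-unfolding (unrolling) technique the abstract refers to, here is a minimal sketch in C. This is not code from the paper: the function `dot_unrolled` and its 4-way unroll factor are illustrative assumptions. The idea is that an unrolled inner loop (such as the per-topic probability accumulation inside a Gibbs sampler) reduces branch overhead and exposes independent multiply-adds that a paired floating-point unit like Blue Gene's can issue together.

```c
#include <stddef.h>

/* Hypothetical example (not from the paper): a dot product with the loop
   manually unfolded by a factor of 4. The four accumulators s0..s3 carry
   independent chains of multiply-adds, so the hardware can overlap them;
   a remainder loop handles lengths that are not multiples of 4. */
double dot_unrolled(const double *a, const double *b, size_t n) {
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {      /* unrolled body: 4 iterations per trip */
        s0 += a[i]     * b[i];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }
    for (; i < n; ++i) {              /* remainder loop for leftover elements */
        s0 += a[i] * b[i];
    }
    return (s0 + s1) + (s2 + s3);
}
```

Compilers can apply this transformation automatically (for example, GCC with `-funroll-loops`), but as the report notes, the choice of unroll factor and accumulator count often benefits from per-architecture tuning.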
Similar resources
Molecular simulation of complex systems using massively parallel supercomputers
Massively parallel supercomputers, such as the 150 Gigaflop Intel Paragons located at Oak Ridge National Laboratory and Sandia National Laboratories, make possible molecular simulation of systems of unprecedented complexity and realism. We describe some of the issues related to efficient implementation of molecular dynamics and Monte Carlo simulations on massively parallel supercomputers. The a...
Lattice QCD with Commodity Hardware and Software
Large-scale QCD Monte Carlo calculations have typically been performed on either commercial supercomputers or specially built massively parallel computers such as Fermilab's ACPMAPS. Commodity computer systems offer impressive floating-point performance-to-cost ratios which exceed those of commercial supercomputers. As high-performance networking components approach commodity pricing, it becomes...
A Visual Analytics System for Optimizing Communications in Massively Parallel Applications
Current and future supercomputers have tens of thousands of compute nodes interconnected with high-dimensional networks and complex network topologies for improved performance. Application developers are required to write scalable parallel programs in order to achieve high throughput on these machines. Application performance is largely determined by efficient inter-process communication. A com...
Tuning HipGISAXS on Multi and Many Core Supercomputers
With the continual development of multi- and many-core architectures, there is a constant need for architecture-specific tuning of application codes in order to realize high computational performance and energy efficiency, closer to the theoretical peaks of these architectures. In this paper, we present optimization and tuning of HipGISAXS, a parallel X-ray scattering simulation code [1], on vario...
Multi-Million Particle Molecular Dynamics on MPPs
We discuss the computational difficulties associated with performing large-scale molecular dynamics simulations involving more than 100 million atoms on modern massively parallel supercomputers. We discuss various performance and memory optimization strategies along with the method we have used to write a highly portable parallel application. Finally, we discuss some recent work addressing the pr...
Publication date: 2011